Skip to content

lnwallet/channeldb: restore fee updates after restart#10622

Draft
ziggie1984 wants to merge 1 commit intolightningnetwork:masterfrom
ziggie1984:fix/restore-fee-updates-after-restart
Draft

lnwallet/channeldb: restore fee updates after restart#10622
ziggie1984 wants to merge 1 commit intolightningnetwork:masterfrom
ziggie1984:fix/restore-fee-updates-after-restart

Conversation

@ziggie1984
Copy link
Collaborator

Summary

  • Fix a bug where UpdateFee entries persisted under remoteUnsignedLocalUpdatesKey were silently skipped during restoreStateLogs, leaving the update log empty after a restart
  • This caused a retransmitted commit_sig (covering the new fee rate) to fail verification after channel reestablishment, incorrectly triggering a force close
  • Refactor AdvanceCommitChainTail to avoid an early return when unsignedAckedUpdates is nil, so the full validation path always runs
  • Adds TestChanSyncUpdateFeeRestartAfterRevoke to reproduce the scenario

Root Cause

During ReceiveRevocation, the fee update is persisted under remoteUnsignedLocalUpdatesKey. However, restoreStateLogs only handles HTLC-type updates (UpdateFulfillHTLC, UpdateFailHTLC, etc.) and silently skips UpdateFee via default: continue. After restart the update log is empty, the expected commitment is computed with the old fee rate, and the signature verification fails.

Test Plan

  • TestChanSyncUpdateFeeRestartAfterRevoke passes
  • Existing channel sync tests still pass

@gemini-code-assist
Copy link

Summary of Changes

Hello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses a subtle but critical issue in channel state management where fee updates were not properly restored from disk after a node restart. This oversight could lead to commitment signature verification failures during channel reestablishment, resulting in unnecessary channel force closures. The changes ensure that all pending fee updates are correctly reloaded, maintaining channel consistency and preventing erroneous state discrepancies.

Highlights

  • Bug Fix: Resolved a critical bug where UpdateFee entries persisted under remoteUnsignedLocalUpdatesKey were incorrectly skipped during restoreStateLogs after a node restart.
  • Impact Mitigation: This fix prevents retransmitted commit_sig messages (covering new fee rates) from failing verification after channel reestablishment, thereby avoiding incorrect force closures.
  • Code Refactoring: Refactored the AdvanceCommitChainTail function to ensure that the full validation path always executes, even when unsignedAckedUpdates is nil, by removing an early return.
  • New Test Case: Added TestChanSyncUpdateFeeRestartAfterRevoke to specifically reproduce and validate the fix for the described restart scenario involving fee updates.

🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console.

Changelog
  • channeldb/channel.go
    • Modified AdvanceCommitChainTail to handle cases where unsignedAckedUpdatesKey data is nil without an early return, ensuring deserialization logic proceeds correctly.
  • lnwallet/channel_test.go
    • Added TestChanSyncUpdateFeeRestartAfterRevoke to simulate a channel restart scenario where fee updates need to be correctly restored and verified.
Activity
  • The pull request author has outlined a test plan, including running the new test TestChanSyncUpdateFeeRestartAfterRevoke and ensuring existing channel sync tests pass.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@ziggie1984 ziggie1984 linked an issue Mar 3, 2026 that may be closed by this pull request
2 tasks
@lightninglabs-deploy lightninglabs-deploy added the severity-critical Requires expert review - security/consensus critical label Mar 3, 2026
Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request addresses a critical bug where fee updates were not correctly restored after a restart, potentially leading to incorrect force closures. The refactoring in AdvanceCommitChainTail to prevent an early return is a clean and effective fix, ensuring that all pending local updates are persisted correctly. The addition of TestChanSyncUpdateFeeRestartAfterRevoke is excellent, as it thoroughly covers the specific failure scenario and confirms the fix. The changes are well-implemented and improve the overall robustness of the channel state management.

Note: Security Review is unavailable for this PR.

Fix a bug where UpdateFee entries persisted under
remoteUnsignedLocalUpdatesKey were silently skipped during
restoreStateLogs, leaving the update log empty after a restart.

This caused a retransmitted commit_sig (covering the new fee rate)
to fail verification after channel reestablishment, incorrectly
triggering a force close.

Also refactor AdvanceCommitChainTail to avoid an early return when
unsignedAckedUpdates is nil, so that the full validation path always
runs.

Adds TestChanSyncUpdateFeeRestartAfterRevoke (lnwallet) and
TestChannelLinkFailDuringRestart (htlcswitch) to reproduce the
scenario at both the channel and link layers.
@ziggie1984 ziggie1984 force-pushed the fix/restore-fee-updates-after-restart branch from 952f500 to 39b8699 Compare March 3, 2026 08:33
@lightninglabs-deploy lightninglabs-deploy added severity-critical Requires expert review - security/consensus critical and removed severity-critical Requires expert review - security/consensus critical labels Mar 3, 2026
@lightninglabs-deploy
Copy link
Collaborator

🔴 PR Severity: CRITICAL

Automated classification | 1 non-test file | 18 lines changed

🔴 Critical (1 file)
  • channeldb/channel.go - Channel state persistence; channeldb/* is always CRITICAL
🟢 Low (2 files — excluded from classification)
  • htlcswitch/link_test.go - Test file (*_test.go), excluded
  • lnwallet/channel_test.go - Test file (*_test.go), excluded

Analysis

The only non-test file changed is channeldb/channel.go, which falls under the channeldb/* package. This package handles channel state persistence and database migrations, placing it squarely in the CRITICAL tier requiring expert review.

The two other changed files are test files (*_test.go) in the htlcswitch and lnwallet packages. Per classification rules, test files are LOW severity and excluded from line/file counting for bump evaluation.

Bump check: 1 non-test file (threshold: >20) and 18 non-test lines changed (threshold: >500) — no severity bump applied.


To override, add a severity-override-{critical,high,medium,low} label.

Copy link

@Bortlesboat Bortlesboat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK 39b8699

Good catch — I've seen this exact pattern cause spurious force closes on my node after fee spikes. The channeldb fix is clean. Left a couple of notes on the link test.

time.Sleep(5 * time.Second)

case <-time.After(5 * time.Second):
t.Fatalf("did not receive channel_reestablish message")

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The time.Sleep(5 * time.Second) here to wait for the force close processing feels brittle — if the test runner is slow this might not be enough, and on fast machines it just wastes time. Could this use a channel signal or poll on a condition instead? Something like piping through the OnChannelFailure callback into a linkErrors chan (like TestChannelLinkUpdateCommitFee does) and selecting on that with a timeout would be more deterministic.

// TestChannelLinkFailDuringRestart tests that the link fails (force-closing
// the channel) if the commit dance is not completed after sharing the
// update_fee and one of the peers restarts before completion.
//

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: the test doc comment says "tests that the link fails (force-closing the channel)" but after the fix the assertion is that force close does NOT happen. Might want to update the comment to reflect the corrected behavior, something like "tests that the link does not force-close after a restart when a fee update was pending".

lnwire.ShortChannelID, LinkFailureError) {
OnChannelFailure: func(_ lnwire.ChannelID,
_ lnwire.ShortChannelID, linkErr LinkFailureError) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This changes shared test infrastructure — restartLink is used by ~15 other tests. Adding require.NotEqual(t, linkErr.FailureAction, LinkFailureForceClose) here means any future test that legitimately expects a force close through this path would silently break. Safer to keep the no-op default and override OnChannelFailure in TestChannelLinkFailDuringRestart specifically, like TestChannelLinkUpdateCommitFee does.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

severity-critical Requires expert review - security/consensus critical

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[bug]: Force close triggered on restart during incomplete commit dance

3 participants